Locality-Driven Scheduling of Tasks for Data-Dependent Multithreading
Authors
Abstract
The amount of data movement in an application has a direct impact on both its execution time and its power consumption. One way to reduce it is to implement locality-aware scheduling algorithms that maximize the reuse of data when assigning work to hardware threads. Locality-Driven Code Scheduling (LDCS), one such algorithm, groups the tasks that process a common data block into phases of a single coarse-grain construct named a super-task, with each phase fired according to dataflow semantics. LDCS reduces the number of long-latency operations by executing all the phases of a super-task on the same hardware thread and by reading the data block from main memory only in the first phase and writing it back only in the last phase, while the intermediate phases rely on the presence of the block in the upper levels of the memory hierarchy. This paper analyzes the impact that LDCS can have on the execution time and power consumption of an application, and presents experimental results from two systems, one with a software-managed memory hierarchy and one with hardware data caches, showing that LDCS can improve the power efficiency of an application by up to 72% and by 28% on average, respectively, under weak scaling.
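To make the idea concrete, the sketch below shows in plain C with POSIX threads how tasks that touch the same data block could be fused into a super-task whose phases run back-to-back on a single worker thread, with the block read from main memory only before the first phase and written back only after the last one. This is a minimal illustration under assumed names: the super_task structure, the phase functions, and the local buffer standing in for the upper levels of the memory hierarchy are invented for the example and are not the authors' implementation.

```c
/*
 * Illustrative sketch in the spirit of LDCS: all phases that process the
 * same data block run back-to-back on one worker thread, so the block is
 * loaded from main memory once and stored back once.  Thread pinning to a
 * specific core (e.g. via an affinity call) is omitted for brevity.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 1024

typedef void (*phase_fn)(double *block, size_t n);

typedef struct {
    double   *block_in_memory;   /* data block resident in main memory  */
    phase_fn *phases;            /* tasks grouped into one super-task   */
    int       num_phases;
} super_task;

/* Example phases: each one reuses the block already brought on-chip. */
static void scale_phase(double *b, size_t n) { for (size_t i = 0; i < n; i++) b[i] *= 2.0; }
static void shift_phase(double *b, size_t n) { for (size_t i = 0; i < n; i++) b[i] += 1.0; }
static void clamp_phase(double *b, size_t n) { for (size_t i = 0; i < n; i++) if (b[i] > 100.0) b[i] = 100.0; }

/* One worker (hardware thread) executes every phase of its super-task. */
static void *run_super_task(void *arg)
{
    super_task *st = (super_task *)arg;
    double local[BLOCK_SIZE];           /* stands in for cache / scratchpad */

    /* First phase: the only read of the block from main memory. */
    memcpy(local, st->block_in_memory, sizeof local);

    /* Remaining phases fire in their dependence order (a simple chain
       here) and reuse the on-chip copy; no extra main-memory traffic. */
    for (int p = 0; p < st->num_phases; p++)
        st->phases[p](local, BLOCK_SIZE);

    /* Last phase: the only write of the block back to main memory. */
    memcpy(st->block_in_memory, local, sizeof local);
    return NULL;
}

int main(void)
{
    double *block = calloc(BLOCK_SIZE, sizeof *block);
    phase_fn phases[] = { scale_phase, shift_phase, clamp_phase };
    super_task st = { block, phases, 3 };

    pthread_t worker;
    pthread_create(&worker, NULL, run_super_task, &st);
    pthread_join(worker, NULL);

    printf("block[0] after super-task: %f\n", block[0]);
    free(block);
    return 0;
}
```

In this arrangement only two long-latency transfers occur per block, however many tasks operate on it, which is the source of the reduction in data movement that the abstract describes.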
Similar Resources
Evaluating map reduce tasks scheduling algorithms over cloud computing infrastructure
Efficiently scheduling MapReduce tasks is considered one of the major challenges facing MapReduce frameworks. Many algorithms have been introduced to tackle this issue. Most of them focus on the data locality property for task scheduling. Data locality may, however, cause lower physical resource utilization and higher power consumption in non-virtualized clusters. Virtualized clust...
Array Regrouping on CMP with Non-uniform Cache Sharing
Array regrouping enhances program spatial locality by interleaving elements of multiple arrays that tend to be accessed closely together. Its effectiveness has been systematically studied for sequential programs running on unicore processors, but not for multithreaded programs on modern Chip Multiprocessor (CMP) machines. On one hand, the processor-level parallelism on CMP intensifies memory bandwidth ...
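As a rough illustration of the regrouping idea, the hypothetical C fragment below contrasts two separate arrays, whose corresponding elements land on different cache lines, with an interleaved layout in which each pair of co-accessed elements shares a line. The names pair_t, sum_separate, and sum_regrouped are invented for the example and do not come from the cited work.

```c
/* Sketch of array regrouping: two arrays whose elements are always used
   together are interleaved so that each iteration touches one cache line
   instead of two.  Layout and names are illustrative only. */
#include <stdio.h>
#include <stdlib.h>

#define N 1000000

/* Before regrouping: a[i] and b[i] live in different arrays. */
static double sum_separate(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* After regrouping: a[i] and b[i] are adjacent in memory. */
typedef struct { double a, b; } pair_t;

static double sum_regrouped(const pair_t *p, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += p[i].a * p[i].b;
    return s;
}

int main(void)
{
    double *a = malloc(N * sizeof *a), *b = malloc(N * sizeof *b);
    pair_t *p = malloc(N * sizeof *p);
    for (size_t i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; p[i].a = 1.0; p[i].b = 2.0; }

    printf("separate:  %f\n", sum_separate(a, b, N));
    printf("regrouped: %f\n", sum_regrouped(p, N));

    free(a); free(b); free(p);
    return 0;
}
```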
Simulation Study of Multithreaded Virtual Processor
This paper proposes the Multithreaded Virtual Processor (MVP) architecture model as a means of integrating the multithreaded programming paradigm and a modern superscalar processor with support for fast context switching and thread scheduling. In order to validate our idea, a simulator was developed using a POSIX compliant Pthreads package and a generic superscalar simulator called SimpleScalar...
Effects of Multithreading on Cache Performance
As the performance gap between processor and memory grows, memory latency becomes a major bottleneck in achieving high processor utilization. Multithreading has emerged as one of the most promising and exciting techniques used to tolerate memory latency by exploiting thread-level parallelism. The question, however, remains as to how effective multithreading is at tolerating memory latency. The...
Cache-Affinity Scheduling for Fine Grain Multithreading
Cache utilisation is often very poor in multithreaded applications, due to the loss of data access locality incurred by frequent context switching. This problem is compounded on shared memory multiprocessors when dynamic load balancing is introduced and thread migration disrupts cache content. In this paper, we present a technique, which we refer to as ‘batching’, for reducing the negative impa...